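Air pollution poses a serious threat to sustainable environmental conditions in the 21st century. Its importance in determining health and living standards in urban environments will only increase over time. Factors ranging from artificial emissions to natural phenomena are the main causes of, or influences on, rising air pollution levels. However, the lack of large-scale data covering the major anthropogenic and natural factors has hindered research into the causes of, and relationships behind, the variability of different air pollutants. In this work, we introduce a large-scale city-wise dataset for exploring the relationships among these agents. We also introduce a transformer-based model, Cossquareformer, for pollutant level estimation and forecasting. Our model outperforms most benchmark models on this task. We also analyze and explore the dataset with our model and other methods to draw important inferences that help us understand the dynamics of the causal agents at a deeper level. With this paper, we seek to provide a substantial foundation for further research in this domain, which will demand critical attention in the near future.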
translated by 谷歌翻译
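Self-supervised learning and pre-training strategies have been developed over the last few years, especially for convolutional neural networks (CNNs). Recently, these methods have also been applied to graph neural networks (GNNs). In this paper, we use a graph-based self-supervised learning strategy with different loss functions (Barlow Twins [Zbontar et al., 2021], HSIC [Tsai et al., 2021], VICReg [Bardes et al., 2021]) that have shown promising results when previously applied to CNNs. We also propose a hybrid loss that combines the advantages of VICReg and HSIC, called VICRegHSIC. The performance of these methods is compared on 7 different datasets, such as PROTEINS and IMDB-Binary. Experiments show that our proposed hybrid loss function outperforms the remaining loss functions in 4 of the cases. In addition, the impact of different batch sizes, projector dimensions, and data augmentations is explored.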
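As a rough illustration of one of the loss functions named above, the following is a minimal NumPy sketch of the Barlow Twins objective: the cross-correlation matrix of two embedding views is pushed toward the identity. It is not the authors' implementation. The function name `barlow_twins_loss`, the weight `lam`, and the random arrays standing in for graph-encoder outputs are all assumptions made for this example.

```python
import numpy as np

def barlow_twins_loss(z1, z2, lam=5e-3, eps=1e-8):
    """Barlow Twins style loss for two (batch, dim) embedding matrices.

    `lam` (off-diagonal weight) is an assumed value chosen for illustration.
    """
    n = z1.shape[0]
    # Standardize every feature over the batch dimension.
    z1 = (z1 - z1.mean(axis=0)) / (z1.std(axis=0) + eps)
    z2 = (z2 - z2.mean(axis=0)) / (z2.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z1.T @ z2 / n
    diag = np.diag(c)
    on_diag = np.sum((1.0 - diag) ** 2)           # invariance term
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)  # redundancy-reduction term
    return on_diag + lam * off_diag

# Random arrays stand in for encoder outputs of two augmented graph views.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(256, 8))
loss_same = barlow_twins_loss(view_a, view_a)                      # identical views
loss_diff = barlow_twins_loss(view_a, rng.normal(size=(256, 8)))   # unrelated views
print(loss_same, loss_diff)
```

Identical views should score far lower than unrelated ones, because the diagonal of the cross-correlation matrix is then close to one.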
A hallmark of human intelligence is the ability to learn new concepts purely from language. Several recent approaches have explored training machine learning models via natural language supervision. However, these approaches fall short in leveraging linguistic quantifiers (such as 'always' or 'rarely') and mimicking humans in compositionally learning complex tasks. Here, we present LaSQuE, a method that can learn zero-shot classifiers from language explanations by using three new strategies - (1) modeling the semantics of linguistic quantifiers in explanations (including exploiting ordinal strength relationships, such as 'always' > 'likely'), (2) aggregating information from multiple explanations using an attention-based mechanism, and (3) model training via curriculum learning. With these strategies, LaSQuE outperforms prior work, showing an absolute gain of up to 7% in generalizing to unseen real-world classification tasks.
Existing Temporal Action Detection (TAD) methods typically take a pre-processing step that converts a varying-length input video into a fixed-length snippet representation sequence before temporal boundary estimation and action classification. This pre-processing step temporally downsamples the video, reducing the inference resolution and hampering detection performance at the original temporal resolution. In essence, this is due to a temporal quantization error introduced during resolution downsampling and recovery. This can negatively impact TAD performance, but is largely ignored by existing methods. To address this problem, in this work we introduce a novel model-agnostic post-processing method that requires no model redesign or retraining. Specifically, we model the start and end points of action instances with a Gaussian distribution to enable temporal boundary inference at a sub-snippet level. We further introduce an efficient Taylor-expansion based approximation, dubbed Gaussian Approximated Post-processing (GAP). Extensive experiments demonstrate that our GAP can consistently improve a wide variety of pre-trained off-the-shelf TAD models on the challenging ActivityNet (+0.2% to +0.7% in average mAP) and THUMOS (+0.2% to +0.5% in average mAP) benchmarks. Such performance gains are already significant and highly comparable to those achieved by novel model designs. Also, GAP can be integrated with model training for further performance gain. Importantly, GAP enables lower temporal resolutions for more efficient inference, facilitating low-resource applications. The code will be available at https://github.com/sauradip/GAP
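The sub-snippet idea can be illustrated with a small sketch. If boundary scores around a peak are assumed to follow a Gaussian, their logarithm is quadratic, so a second-order Taylor expansion at the integer argmax gives the continuous peak location in closed form. The code below is a generic illustration of that reasoning, not the released GAP code. The function name `refine_peak` and the use of central finite differences for the derivatives are assumptions.

```python
import numpy as np

def refine_peak(scores, eps=1e-10):
    """Estimate a sub-snippet peak location from a 1D boundary score sequence.

    Assumes the scores near the peak are roughly Gaussian, so the log-scores
    are quadratic and one Newton-like step from the argmax reaches the mode.
    """
    scores = np.asarray(scores, dtype=float)
    m = int(np.argmax(scores))
    # No neighbours on both sides: keep the quantized location.
    if m == 0 or m == len(scores) - 1:
        return float(m)
    log_s = np.log(np.maximum(scores, eps))
    d1 = 0.5 * (log_s[m + 1] - log_s[m - 1])            # first derivative at m
    d2 = log_s[m + 1] - 2.0 * log_s[m] + log_s[m - 1]   # second derivative at m
    # A non-negative curvature means the local shape is not a peak.
    if d2 >= 0:
        return float(m)
    return float(m - d1 / d2)

# Synthetic Gaussian-shaped scores whose true centre lies between snippets.
grid = np.arange(12)
true_center = 5.3
scores = np.exp(-0.5 * ((grid - true_center) / 2.0) ** 2)
refined = refine_peak(scores)
print(int(np.argmax(scores)), refined)
```

On exactly Gaussian scores the refinement recovers the true centre, whereas the plain argmax is limited to the snippet grid.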
Few-shot (FS) and zero-shot (ZS) learning are two different approaches for scaling temporal action detection (TAD) to new classes. The former adapts a pretrained vision model to a new task represented by as few as a single video per class, whilst the latter requires no training examples by exploiting a semantic description of the new class. In this work, we introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered a marriage of FS-TAD and ZS-TAD that leverages few-shot support videos and new class names jointly. To tackle this problem, we further introduce a novel MUlti-modality PromPt mETa-learning (MUPPET) method. This is enabled by efficiently bridging pretrained vision and language models whilst maximally reusing already learned capacity. Concretely, we construct multi-modal prompts by mapping support videos into the textual token space of a vision-language model using a meta-learned adapter-equipped visual semantics tokenizer. To tackle large intra-class variation, we further design a query feature regulation scheme. Extensive experiments on ActivityNet v1.3 and THUMOS14 demonstrate that our MUPPET outperforms state-of-the-art alternative methods, often by a large margin. We also show that our MUPPET can be easily extended to tackle the few-shot object detection problem, where it again achieves state-of-the-art performance on the MS-COCO dataset. The code will be available at https://github.com/sauradip/MUPPET
Recently, graph neural networks have been gaining a lot of attention to simulate dynamical systems due to their inductive nature leading to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely, Hamiltonian and Lagrangian graph neural networks, graph neural ODE, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulation highlighting the similarities and differences in the inductive biases and graph architecture of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems to compare the performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training system, thus providing a promising route to simulate large-scale realistic systems.
Graph neural networks (GNNs) find applications in various domains such as computational biology, natural language processing, and computer security. Owing to their popularity, there is an increasing need to explain GNN predictions, since GNNs are black-box machine learning models. One way to address this is counterfactual reasoning, where the objective is to change the GNN prediction with minimal changes to the input graph. Existing methods for counterfactual explanation of GNNs are limited to instance-specific local reasoning. This approach has two major limitations: it cannot offer global recourse policies, and it overloads human cognitive ability with too much information. In this work, we study the global explainability of GNNs through global counterfactual reasoning. Specifically, we want to find a small set of representative counterfactual graphs that explains all input graphs. Towards this goal, we propose GCFExplainer, a novel algorithm powered by vertex-reinforced random walks on an edit map of graphs with a greedy summary. Extensive experiments on real graph datasets show that the global explanation from GCFExplainer provides important high-level insights into the model behavior and achieves a 46.9% gain in recourse coverage and a 9.5% reduction in recourse cost compared to the state-of-the-art local counterfactual explainers.
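Lagrangian and Hamiltonian neural networks (LNNs and HNNs, respectively) encode strong inductive biases that allow them to significantly outperform other models of physical systems. However, these models have so far mostly been limited to simple systems such as pendulums and springs, or to a single rigid body such as a gyroscope or a rigid rotor. Here, we present a Lagrangian graph neural network (LGNN) that can learn the dynamics of rigid bodies by exploiting their topology. We demonstrate the performance of LGNN by learning the dynamics of ropes, chains, and trusses, with the bars modeled as rigid bodies. LGNN also exhibits generalizability: an LGNN trained on chains with a few segments generalizes to simulate chains with a large number of links and arbitrary link lengths. We also show that the LGNN can simulate unseen hybrid systems, including bars and chains, on which it has not been trained. Specifically, we show that the LGNN can be used to model the dynamics of complex real-world structures, such as the stability of tensegrity structures. Finally, we discuss the non-diagonal nature of the mass matrix and its ability to generalize in complex systems.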
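Advances in computer vision and machine learning enable robots to perceive their surroundings in powerful new ways, but these perception modules have well-known fragilities. We consider the problem of synthesizing a controller that remains safe despite perception errors. The proposed method constructs a state estimator based on Gaussian processes with input-dependent noise. Given a perceived state, this estimator computes a high-confidence set for the actual state. A robust neural network controller is then synthesized that can provably handle the state uncertainty. Furthermore, an adaptive sampling algorithm is proposed to jointly improve the estimator and the controller. Simulation experiments, including a realistic lane-based example in CARLA, illustrate the promise of the proposed approach for synthesizing robust controllers with deep-learning-based perception.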
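Neural networks with physics-based inductive biases, such as Lagrangian neural networks (LNNs) and Hamiltonian neural networks (HNNs), learn the dynamics of physical systems by encoding strong inductive biases. Alternatively, neural ODEs with appropriate inductive biases have also been shown to give similar performance. However, when applied to particle-based systems, these models are transductive in nature and hence do not generalize to large system sizes. In this paper, we present a graph-based neural ODE, GNODE, to learn the time evolution of dynamical systems. Further, we carefully analyze the role of different inductive biases on the performance of GNODE. We show that, similar to LNN and HNN, encoding the constraints explicitly can significantly improve the training efficiency and performance of GNODE. Our experiments also assess the value of additional inductive biases, such as Newton's third law, for the final performance of the model. We demonstrate that inducing these biases can enhance the performance of the model by orders of magnitude in terms of both energy violation and rollout error. Interestingly, we observe that the GNODE trained with the most effective inductive biases, namely MCGNODE, outperforms the graph versions of LNN and HNN, namely Lagrangian graph networks (LGN) and Hamiltonian graph networks (HGN), in terms of energy violation error by approximately 4 orders of magnitude for a pendulum system and approximately 2 orders of magnitude for a spring system. These results suggest that performance competitive with energy-conserving neural networks can be obtained for NODE-based systems by inducing appropriate inductive biases.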
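To make the Newton's third law bias concrete, here is a minimal sketch of one common way to build it into a message-passing step: pairwise forces are antisymmetrized so that the force of node j on node i is the exact negative of the force of i on j, and the net force over the whole system then sums to zero. This is an illustrative construction, not the GNODE architecture. The single tanh layer `edge_message`, the weight matrix `W`, and the ring-shaped edge list are hypothetical stand-ins for a learned edge network and a real system.

```python
import numpy as np

def edge_message(xi, xj, W):
    # Hypothetical stand-in for a learned edge network: one tanh layer.
    return np.tanh(np.concatenate([xi, xj]) @ W)

def antisymmetric_forces(pos, edges, W):
    """Net force per node, with f_ij = -f_ji enforced by construction."""
    forces = np.zeros_like(pos)
    for i, j in edges:
        # Antisymmetrize the raw messages so each pair obeys Newton's third law.
        f_ij = 0.5 * (edge_message(pos[i], pos[j], W)
                      - edge_message(pos[j], pos[i], W))
        forces[i] += f_ij   # force on node i due to node j
        forces[j] -= f_ij   # equal and opposite force on node j
    return forces

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 2))                        # 5 particles in 2D
W = rng.normal(size=(4, 2))                          # random, untrained weights
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]     # assumed ring topology
forces = antisymmetric_forces(pos, edges, W)
total = forces.sum(axis=0)
print(total)
```

Because the internal forces cancel pairwise, the total force vanishes for any weights, so linear momentum is conserved even before training.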